
    DynaVSR: Dynamic Adaptive Blind Video Super-Resolution

    Most conventional supervised super-resolution (SR) algorithms assume that low-resolution (LR) data is obtained by downscaling high-resolution (HR) data with a fixed, known kernel, but this assumption often does not hold in real scenarios. Some recent blind SR algorithms estimate a different downscaling kernel for each input LR image, but they suffer from heavy computational overhead, making them infeasible for direct application to videos. In this work, we present DynaVSR, a novel meta-learning-based framework for real-world video SR that enables efficient downscaling-model estimation and adaptation to the current input. Specifically, we train a multi-frame downscaling module with various types of synthetic blur kernels and seamlessly combine it with a video SR network for input-aware adaptation. Experimental results show that DynaVSR consistently improves the performance of state-of-the-art video SR models by a large margin, with an order of magnitude faster inference time than existing blind SR approaches.
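    As a rough illustration of the degradation model described above, the following numpy sketch produces a synthetic LR frame from an HR frame using a Gaussian blur kernel, one of the kernel types such a downscaling module could be trained on (the function names, 2x scale, and edge padding are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Isotropic Gaussian blur kernel, one example of a synthetic kernel type."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def degrade(hr, kernel, scale=2):
    """Blur an HR frame with `kernel`, then subsample by `scale` to get an LR frame."""
    pad = kernel.shape[0] // 2
    padded = np.pad(hr, pad, mode="edge")
    h, w = hr.shape
    blurred = np.zeros_like(hr, dtype=float)
    for i in range(h):
        for j in range(w):
            window = padded[i:i + kernel.shape[0], j:j + kernel.shape[1]]
            blurred[i, j] = np.sum(window * kernel)
    return blurred[::scale, ::scale]
```

    Training pairs generated this way, over many sampled kernels, are what would let a downscaling module learn to infer the degradation from the LR input alone.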

    Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making

    The recent success of the Transformer in natural language processing has sparked its use in various domains. In offline reinforcement learning (RL), the Decision Transformer (DT) is emerging as a promising Transformer-based model. However, we find that the attention module of DT is not well suited to capturing the inherent local dependence patterns in RL trajectories modeled as a Markov decision process. To overcome this limitation, we propose a novel action sequence predictor, named Decision ConvFormer (DC), based on the MetaFormer architecture, a general structure that processes multiple entities in parallel and models their interrelationships. DC employs local convolution filtering as the token mixer and can effectively capture the inherent local associations of an RL dataset. In extensive experiments, DC achieved state-of-the-art performance across various standard RL benchmarks while requiring fewer resources. Furthermore, we show that DC better understands the underlying meaning of the data and exhibits enhanced generalization capability.
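    The local token mixer that replaces attention can be sketched as a depthwise causal convolution over the trajectory sequence: each token is mixed only with a small window of its predecessors, matching the Markovian locality argued for above (a minimal numpy sketch with fixed uniform weights; the actual module learns its filter weights):

```python
import numpy as np

def causal_conv_mixer(tokens, window=3):
    """Depthwise causal convolution along the sequence axis: token t is
    mixed only with tokens t-window+1 .. t (local, Markov-like context)."""
    T, D = tokens.shape
    # one filter per channel; uniform weights stand in for learned ones
    weights = np.full((window, D), 1.0 / window)
    padded = np.vstack([np.zeros((window - 1, D)), tokens])
    out = np.zeros_like(tokens, dtype=float)
    for t in range(T):
        out[t] = np.sum(padded[t:t + window] * weights, axis=0)
    return out
```

    Swapping this mixer into a MetaFormer-style block in place of self-attention is the design choice the abstract describes: the surrounding residual/MLP structure stays, only the token-mixing operation changes.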

    Evaluating the Robustness of Trigger Set-Based Watermarks Embedded in Deep Neural Networks

    Trigger set-based watermarking schemes have attracted growing attention because they provide a means for deep neural network model owners to prove ownership. In this paper, we argue that state-of-the-art trigger set-based watermarking algorithms do not achieve their designed goal of proving ownership. We posit that this impaired capability stems from two common experimental flaws in existing research practice when evaluating the robustness of watermarking algorithms: (1) incomplete adversarial evaluation and (2) overlooked adaptive attacks. We conduct a comprehensive adversarial evaluation of 10 representative watermarking schemes against six existing attacks and demonstrate that each of these schemes lacks robustness against at least two attacks. We also propose novel adaptive attacks that harness the adversary's knowledge of the underlying watermarking algorithm of a target model. We demonstrate that the proposed attacks effectively break all 10 watermarking schemes, consequently allowing adversaries to obscure the ownership of any watermarked model. We encourage follow-up studies to consider our guidelines when evaluating the robustness of their watermarking schemes, by conducting comprehensive adversarial evaluations that include our adaptive attacks, to establish a meaningful upper bound on watermark robustness.
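    The ownership test such trigger-set schemes rely on reduces to checking a suspect model's accuracy on the secret trigger set; an attack "breaks" the watermark if it drives this accuracy below the decision threshold without hurting normal-task accuracy (a minimal sketch; the threshold and the toy models below are illustrative, not from any specific scheme in the paper):

```python
import numpy as np

def verify_ownership(model, trigger_inputs, trigger_labels, threshold=0.9):
    """Claim ownership if the suspect model reproduces the secret
    trigger-set labels above `threshold` accuracy."""
    preds = np.array([model(x) for x in trigger_inputs])
    accuracy = float(np.mean(preds == np.array(trigger_labels)))
    return accuracy, accuracy >= threshold
```

    An adaptive attacker who knows the watermarking algorithm can target exactly this test, e.g. by identifying and relabeling trigger-like inputs, which is why the paper argues that evaluations omitting such attacks overstate robustness.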

    Learning Vehicle Dynamics from Cropped Image Patches for Robot Navigation in Unpaved Outdoor Terrains

    In the realm of autonomous mobile robots, safe navigation through unpaved outdoor environments remains a challenging task. Due to the high-dimensional nature of sensor data, extracting relevant information becomes a complex problem, which hinders adequate perception and path planning. Previous works have shown promising performance in extracting global features from full-sized images. However, they often face challenges in capturing essential local information. In this paper, we propose Crop-LSTM, which iteratively takes cropped image patches around the robot's current position and predicts its future position, orientation, and bumpiness. Our method performs local feature extraction by attending to the corresponding image patches along the predicted robot trajectory in the 2D image plane, enabling more accurate predictions of the robot's future trajectory. With our wheeled mobile robot platform Raicart, we demonstrated the effectiveness of Crop-LSTM for point-goal navigation in an unpaved outdoor environment. Our method enabled safe and robust navigation using RGBD images in challenging unpaved outdoor terrains. The summary video is available at https://youtu.be/iIGNZ8ignk0. Comment: 8 pages, 10 figures.
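    The patch-cropping step at the core of the method can be sketched as follows (a minimal numpy sketch; the patch size and the clamping behavior at image borders are assumptions, not details from the paper):

```python
import numpy as np

def crop_patch(image, center, size=32):
    """Crop a `size` x `size` patch around `center` = (row, col),
    clamping the window so it stays inside the image."""
    h, w = image.shape[:2]
    half = size // 2
    r = int(np.clip(center[0], half, h - half))
    c = int(np.clip(center[1], half, w - half))
    return image[r - half:r + half, c - half:c + half]
```

    Iterating this crop along the projected trajectory, e.g. `[crop_patch(img, p) for p in predicted_positions]`, yields the sequence of local patches the recurrent predictor consumes instead of the full-sized image.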

    Broken Kramers' degeneracy in altermagnetic MnTe

    Altermagnetism is a newly identified fundamental class of magnetism with vanishing net magnetization and a time-reversal-symmetry-broken electronic structure. Probing the unusual electronic structure with nonrelativistic spin splitting would be a direct experimental verification of the altermagnetic phase. By combining high-quality film growth and in situ angle-resolved photoemission spectroscopy, we report the electronic structure of an altermagnetic candidate, α-MnTe. A temperature-dependent study reveals the lifting of Kramers' degeneracy accompanied by a magnetic phase transition at T_N = 267 K, with spin splitting of up to 370 meV, providing direct spectroscopic evidence for altermagnetism in MnTe.

    Deep learning-based statistical noise reduction for multidimensional spectral data

    In spectroscopic experiments, data acquisition in a multidimensional phase space may require long acquisition times owing to the large phase-space volume to be covered. In such cases, the limited time available for data acquisition can be a serious constraint for experiments in which multidimensional spectral data are acquired. Here, taking angle-resolved photoemission spectroscopy (ARPES) as an example, we demonstrate a denoising method that utilizes deep learning as an intelligent way to overcome this constraint. With readily available ARPES data and randomly generated training data sets, we successfully trained the denoising neural network without overfitting. The denoising neural network can remove the noise in the data while preserving its intrinsic information. We show that the denoising neural network allows us to perform a similar level of second-derivative and line-shape analysis on data taken with two orders of magnitude less acquisition time. The importance of our method lies in its applicability to any multidimensional spectral data that are susceptible to statistical noise. Comment: 8 pages, 8 figures.
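    The random generation of training pairs from readily available data can be sketched by treating a high-statistics spectrum as ground truth and drawing a low-count Poisson realization of it, since counting statistics are the dominant noise source in such measurements (a minimal numpy sketch; the target count level is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_training_pair(clean_spectrum, counts=1e3):
    """Scale a high-statistics spectrum to a target total count and draw a
    Poisson realization: the (noisy, clean) pair used to train a denoiser."""
    target = clean_spectrum / clean_spectrum.sum() * counts
    noisy = rng.poisson(target).astype(float)
    return noisy, target
```

    Sampling many such pairs at varied count levels from existing data is what makes it possible to train the network without collecting any dedicated noisy/clean measurement pairs.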

    Advanced virtual monoenergetic imaging algorithm for lower extremity computed tomography angiography: effects on image quality, artifacts, and peripheral arterial disease evaluation

    PURPOSE: To investigate the image quality of lower extremity computed tomography angiography (LE-CTA) using a reconstruction algorithm for monoenergetic images (MEIs) at different kiloelectron volt (keV) levels for evaluating peripheral arterial disease (PAD).
    METHODS: A total of 146 consecutive patients who underwent LE-CTA on a dual-energy scanner to obtain MEIs at 40, 50, 60, 70, and 80 keV were included. The overall image quality, segmental image quality of the arteries and PAD segments, venous contamination, and metal artifacts from prostheses, which may compromise quality, were analyzed.
    RESULTS: The mean overall image quality of each MEI was 2.9 ± 0.7, 3.6 ± 0.6, 3.9 ± 0.3, 4.0 ± 0.2, and 4.0 ± 0.2 from 40 to 80 keV, respectively. The segmental image quality gradually increased from 40 keV until reaching its highest value at 70–80 keV. Among 295 PAD segments in 68 patients, 40 (13.6%) were scored 1–2 at 40 keV and 13 (4.4%) were scored 2 at 50 keV, indicating unsatisfactory image quality due to the indistinguishability between high-contrast areas and arterial calcifications. Scores for segments exhibiting metal artifacts and venous contamination improved at 70–80 keV (2.6 ± 1.2, 2.7 ± 0.5) compared with 40 keV (2.4 ± 1.1, 2.5 ± 0.7).
    CONCLUSION: LE-CTA using a reconstruction algorithm for MEIs at 70–80 keV can enhance image quality for PAD evaluation and mitigate venous contamination and metal artifacts.
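    A virtual monoenergetic image is conventionally synthesized as an energy-dependent linear combination of the two material-decomposition basis images from a dual-energy scan; the sketch below assumes a water/iodine decomposition, and the attenuation weights are placeholders rather than clinical values:

```python
import numpy as np

# Illustrative placeholder weights per keV; NOT clinical attenuation values.
MU_WATER = {40: 0.27, 70: 0.19, 80: 0.18}
MU_IODINE = {40: 8.6, 70: 1.9, 80: 1.4}

def monoenergetic_image(basis_water, basis_iodine, kev):
    """Synthesize a virtual monoenergetic image at `kev` as a weighted
    sum of the two material-decomposition basis images."""
    return MU_WATER[kev] * basis_water + MU_IODINE[kev] * basis_iodine
```

    The falling iodine weight with increasing keV is consistent with the finding above: very high vascular contrast at 40 keV (at the cost of artifacts) and cleaner, lower-contrast images at 70–80 keV.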

    Applications and benefits for big data sets using tree distances and the t-SNE algorithm

    Modern data sets often consist of unstructured and mixed data; that is, they include both numerical and categorical variables. Often, these data sets include noise, redundancy, missing values, and outliers. Clustering is one of the most important and widely used data analytic methods. However, clustering requires the ability to measure distances or dissimilarities, which are not defined in an obvious way for mixed data. Practitioners often use the Gower dissimilarity for this task. In this work we use tree distances, computed with Buttrey's treeClust package in R as discussed by Buttrey and Whitaker in 2015, to process mixed data while handling missing values and outliers. Visualization is also an important method for big data. We use the t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm, introduced by van der Maaten and Hinton in 2008, which visualizes high-dimensional data by assigning each data point a location in a two- or three-dimensional map. We also use popular visualization techniques grouped under the name multidimensional scaling. We compare the results of using tree distances with the t-SNE algorithm to those of using the Gower dissimilarity with multidimensional scaling. Unlike established dimensionality reduction techniques, which generally map from high dimensions directly to two (or three) dimensions, we explore a new approach in which the dimensionality reduction takes place in several separate steps. Our experiments show that our new techniques can outperform the established techniques in producing visualizations of high-dimensional mixed data.
    http://archive.org/details/applicationsndbe1094548546
    Captain, Republic of Korea Army
    Approved for public release; distribution is unlimited.
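    The Gower dissimilarity used as the practitioners' baseline above can be sketched directly: a range-scaled Manhattan distance on numeric columns and simple matching on categorical columns, averaged over all variables (a minimal numpy sketch without the missing-value handling treeClust provides):

```python
import numpy as np

def gower(numeric, categorical):
    """Pairwise Gower dissimilarity for mixed data: range-scaled absolute
    differences on numeric columns, 0/1 mismatch on categorical columns,
    averaged over the total number of variables."""
    n = numeric.shape[0]
    col_range = numeric.max(axis=0) - numeric.min(axis=0)
    col_range[col_range == 0] = 1.0  # avoid division by zero on constant columns
    p = numeric.shape[1] + categorical.shape[1]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            num = np.abs(numeric[i] - numeric[j]) / col_range
            cat = (categorical[i] != categorical[j]).astype(float)
            D[i, j] = (num.sum() + cat.sum()) / p
    return D
```

    The resulting square matrix (from either Gower or tree distances) can then be handed to a t-SNE implementation that accepts precomputed distances, e.g. scikit-learn's TSNE(metric="precomputed", init="random"), to produce the two-dimensional map.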